Method and system for aligning a digital model of a structure with a video stream

ABSTRACT

A method of aligning a digital model of a structure with a displayed portion of the structure within a video stream captured by a camera device is disclosed. An approximate position of the camera device in the digital model is determined. A position and an orientation are determined for a plurality of digital surfaces within the digital model visible from the approximate position of the camera device. A position and an orientation of a plurality of object surfaces visible in the video stream are determined. A 3D translation, a 3D scale, and a 3D rotation that maximize an alignment of the position and orientation of the plurality of digital surfaces with the position and orientation of the plurality of object surfaces are determined. The 3D translation, the 3D scale, and the 3D rotation are applied to the digital model, and the digital model is displayed contemporaneously with a display of the video stream.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a continuation of and claims the benefit of priority under 35 U.S.C. § 120 to U.S. Pat. Application Serial No. 17/166,598, filed on Feb. 3, 2021, which claims the benefit of U.S. Provisional Application No. 62/969,537, filed Feb. 3, 2020, entitled “METHOD AND SYSTEM FOR ALIGNING A DIGITAL MODEL OF A STRUCTURE WITH A VIDEO STREAM,” each of which is incorporated by reference herein in its entirety.

TECHNICAL FIELD

The subject matter disclosed herein generally relates to the technical field of computer systems, and in one specific example, to computer systems and methods for aligning a digital model of a structure within a video stream.

BACKGROUND OF THE INVENTION

In architecture and engineering, aligning a digital model of a structure with a real-life representation of the structure is an important initial step for building engineers to perform. Global positioning technologies (e.g., GPS) capable of locating a device using geospatial coordinates determined from satellites are useful for outdoor positioning, but they are not precise enough for placement of a model, nor are they useful for indoor positioning.

Furthermore, point cloud matching technologies for aligning digital models can work on a small scale but have trouble when applied to larger scales, such as buildings. As an example, ICP (Iterative Closest Point) is an algorithm that maximizes the alignment of two point clouds in space. Also, point cloud matching technologies are not able to differentiate between many similar locations within a structure. For example, recognizing the shape of a door or a window from a camera within a building is not sufficient to know which specific door or window of the building the camera is pointed at.

Other technologies, such as motion capture trackers and image markers, allow the alignment of a model with a scene, but these technologies require 1) physical markers placed in the scene, and 2) modification of the structure model to specify the location of the markers within the model so the model and the scene can be aligned. Some of these technologies use images of the scene as trackers, but the images only work in similar lighting conditions, and the tracking images must also be placed in the digital model to enable alignment.

BRIEF DESCRIPTION OF THE DRAWINGS

Features and advantages of example embodiments of the present invention will become apparent from the following detailed description, taken in combination with the appended drawings, in which:

FIG. 1A is a schematic illustrating an MR digital model alignment system, in accordance with one embodiment;

FIG. 1B is a schematic illustrating a head-mounted MR digital model alignment device, in accordance with one embodiment;

FIG. 2 is a flowchart illustrating a method for aligning a digital model of a structure with a real-world structure using an MR digital model alignment device, in accordance with one embodiment;

FIG. 3 is a flowchart of a method for computing a 3D translation, a 3D scale, and a 3D rotation with an MR digital model alignment system, in accordance with one embodiment;

FIG. 4 is a flowchart of a method for computing a 3D translation, a 3D scale, and a 3D rotation with an MR digital model alignment system, in accordance with one embodiment;

FIG. 5 is a flowchart of a method for computing a 2D translation, a 2D scale, and a 2D rotation with an MR digital model alignment system, in accordance with one embodiment;

FIG. 6 is a block diagram illustrating an example software architecture, which may be used in conjunction with various hardware architectures described herein; and

FIG. 7 is a block diagram illustrating components of a machine, according to some example embodiments, configured to read instructions from a machine-readable medium (e.g., a machine-readable storage medium) and perform any one or more of the methodologies discussed herein.

It will be noted that throughout the appended drawings, like features are identified by like reference numerals.

DETAILED DESCRIPTION

The description that follows describes example systems, methods, techniques, instruction sequences, and computing machine program products that comprise illustrative embodiments of the disclosure, individually or in combination. In the following description, for the purposes of explanation, numerous specific details are set forth in order to provide an understanding of various embodiments of the inventive subject matter. It will be evident, however, to those skilled in the art, that various embodiments of the inventive subject matter may be practiced without these specific details.

The term ‘environment’ used throughout the description herein should be understood to include 2D digital environments (e.g., 2D video game environments, 2D simulation environments, 2D content creation environments, and the like), 3D digital environments (e.g., 3D game environments, 3D simulation environments, 3D content creation environments, virtual reality environments, and the like), and augmented reality environments that include both a digital (e.g., virtual) component and a real-world component.

The term ‘digital object’, used throughout the description herein, is understood to include any digital object or digital element within an environment. A digital object can represent (e.g., in a corresponding data structure) almost anything within the environment, including 3D models (e.g., characters, weapons, scene elements (e.g., buildings, trees, cars, treasures, and the like)) with 3D model textures, backgrounds (e.g., terrain, sky, and the like), lights, cameras, effects (e.g., sound and visual), animation, and more. The term ‘digital object’ may also be understood to include linked groups of individual digital objects. A digital object is associated with data that defines properties and behavior for the object.

The terms ‘asset’, ‘game asset’, and ‘digital asset’, used throughout the description herein, are understood to include any data that can be used to describe a digital object or can be used to describe an aspect of a digital project (e.g., including: a game, a film, a software application). For example, an asset can include data for an image, a 3D model (textures, rigging, and the like), a group of 3D models (e.g., an entire scene), an audio sound, a video, animation, a 3D mesh, and the like. The data describing an asset may be stored within a file, or may be contained within a collection of files, or may be compressed and stored in one file (e.g., a compressed file), or may be stored within a memory. The data describing an asset can be used to instantiate one or more digital objects within an environment (e.g., a game) at runtime.

The terms ‘client’ and ‘application client’ used throughout the description herein are understood to include a software client or software application that accesses data and services on a server, including accessing over a network.

The term ‘mixed reality’ or ‘MR’ used throughout the description herein should be understood to include all combined environments in the spectrum between reality and virtual reality (VR), including virtual reality, augmented reality (AR), and augmented virtuality (AV).

A method of aligning a digital model of a structure with a displayed portion of the structure within a video stream captured by a camera device is disclosed. An approximate position of the camera device in the digital model is determined. A position and an orientation are determined for a plurality of digital surfaces within the digital model visible from the approximate position of the camera device. A video stream is received from the camera device. A position and an orientation of a plurality of object surfaces visible in the video stream are determined. A 3D translation, a 3D scale, and a 3D rotation that maximize an alignment of the position and orientation of the plurality of digital surfaces with the position and orientation of the plurality of object surfaces are determined. The 3D translation, the 3D scale, and the 3D rotation are applied to the digital model, and the digital model is displayed contemporaneously with a display of the video stream.

The present invention includes apparatuses which perform one or more operations or one or more combinations of operations described herein, including data processing systems which perform these methods and computer readable media which, when executed on data processing systems, cause the systems to perform these methods, the operations or combinations of operations including non-routine and unconventional operations or combinations of operations.

A mixed reality (MR) digital model alignment system and associated methods are described herein. The MR digital model alignment system is configured to align a digital model of a structure with a representation of the structure within a video captured by a camera on an MR-capable device, wherein the camera is positioned to capture the real-world structure within an MR environment and whereby there are no predefined markers. In an example embodiment, a user (e.g., a wearer of a head mounted display (HMD), or someone holding a smartphone, tablet, or other MR-capable device) experiences the MR environment as presented by the MR digital model alignment system via an MR-capable device. The MR environment includes a view of the real world (e.g., immediate surroundings of the MR-capable device) along with virtual content provided by the MR digital model alignment system. The MR-capable device, in some embodiments, includes a forward-facing camera configured to capture digital video or images of the real world around the user, optionally including depth data, which the MR digital model alignment system may analyze to provide some of the MR digital model alignment features described herein.

Turning now to the drawings, systems and methods, including non-routine or unconventional components or operations, or combinations of such components or operations, for aligning a digital model within a video source without using predefined markers in accordance with embodiments of the invention are illustrated. In accordance with an embodiment, FIG. 1A is a diagram of an example MR digital model alignment system 100 and associated devices configured to provide MR digital model alignment system functionality to a user 102. In the example embodiment, the MR digital model alignment system 100 includes an MR digital model alignment device 104, or ‘MR alignment device’ 104, operated by the user 102 and an MR alignment server device 130 coupled in networked communication with the MR alignment device 104 via a network 150 (e.g., a cellular network, a Wi-Fi network, the Internet, and so forth). The MR alignment device 104 is a computing device capable of providing a mixed reality experience to the user 102.

In the example embodiment, the MR alignment device 104 includes one or more central processing units (CPUs) 106, graphics processing units (GPUs) 108, and/or holographic processing units (HPUs) 110. The processing device 106 is any type of processor, or processor assembly comprising multiple processing elements (not shown), having access to a memory 122 to retrieve instructions stored thereon and execute such instructions. Upon execution, the instructions configure the processing device 106 to perform a series of tasks as described herein in reference to FIG. 2, FIG. 3, FIG. 4, and FIG. 5. The MR alignment device 104 also includes one or more networking devices 112 (e.g., wired or wireless network adapters) for communicating across the network 150. The MR alignment device 104 further includes one or more camera devices 114 which may be configured to capture digital video of the real world near the MR alignment device 104 during operation. The MR alignment device 104 may also include one or more sensors 116, such as a global positioning system (GPS) receiver (e.g., for determining a GPS location of the MR alignment device 104), biometric sensors (e.g., for capturing biometric data of the user 102 or an additional person), motion or position sensors (e.g., for capturing position data of the MR alignment device 104, the user 102, or other objects), or an audio microphone (e.g., for capturing sound data). Some sensors 116 may be external to the MR alignment device 104, and may be configured to wirelessly communicate with the MR alignment device 104 (e.g., such as used in the Microsoft Kinect®, Vive Tracker™, MIT’s Lidar sensor, or MIT’s wireless emotion detector).

The MR alignment device 104 also includes one or more input devices 118 such as, for example, a keyboard or keypad, a mouse, a pointing device, a touchscreen, a hand-held device (e.g., a hand motion tracking device), a microphone, a camera, and the like, for inputting information in the form of a data signal readable by the processing device 106. The MR alignment device 104 further includes one or more display devices 120, such as a touchscreen of a tablet or smartphone, or lenses or a visor of a VR or AR HMD, which may be configured to display virtual objects to the user 102 in conjunction with a real-world view.

The MR alignment device 104 also includes a memory 122 configured to store a client MR digital model alignment module (“client module”) 124. The memory 122 can be any type of memory device, such as random access memory, read-only or rewritable memory, internal processor caches, and the like.

In the example embodiment, the camera device 114 and sensors 116 capture data from the surrounding environment, such as video, audio, depth information, GPS location, and so forth. The client module 124 may be configured to analyze the sensor data directly, or to analyze processed sensor data (e.g., a real-time list of detected and identified objects, object shape data, depth maps, and the like).

In accordance with an embodiment, the memory may also store a game engine (not shown in FIG. 1A) (e.g., executed by the CPU 106 or GPU 108) that communicates with the display device 120 and also with other hardware such as the input/output device(s) 118 to present a 3D simulation environment (e.g., a virtual reality environment, a mixed reality environment, and the like) to the user 102. The game engine would typically include one or more modules that provide the following: simulation of a virtual environment and game objects therein (e.g., including animation of digital objects, animation physics for digital objects, collision detection for digital objects, and the like), rendering of the virtual environment and the digital objects therein, networking, sound, and the like, in order to provide the user with a complete or partial virtual environment (e.g., including a video game environment or simulation environment).

In accordance with an embodiment, the MR alignment server 130 includes a memory 132 storing a server MR digital model alignment module (“server module”) 134. During operation, the client MR digital model alignment module 124 and the server MR digital model alignment module 134 perform the various MR digital model alignment functionalities described herein. More specifically, in some embodiments, some functionality may be implemented within the client module 124 and other functionality may be implemented within the server module 134.

In accordance with some embodiments, the MR alignment device 104 is a mobile computing device, such as a smartphone or a tablet computer. In accordance with another embodiment, and as shown in FIG. 1B, the MR alignment device 104 may be a head-mounted display (HMD) device worn by the user 102, such as an augmented reality (AR) or virtual reality (VR) visor (e.g., Google Glass®, HTC Vive®, Microsoft HoloLens®, the Playstation VR™, Oculus Rift™, and so forth). In the example embodiment, the user 102 (e.g., a game developer) experiences a VR environment or augmented reality (AR) environment while wearing the HMD MR alignment device 104. During operation, in the example embodiment, the HMD MR alignment device 104 is mounted on a head of the wearer 102, and over both eyes of the wearer 102, as shown in FIG. 1B. The wearer 102 may be presented with a virtual environment which may be viewed and edited via the HMD 104 and handhelds as described herein. The HMD MR alignment device 104 includes a transparent or semitransparent visor (or “lens” or “lenses”) 124 through which the wearer 102 views their surroundings (also herein referred to as “the real world”). In other embodiments, the HMD MR alignment device 104 may include an opaque visor 124 which may obscure the wearer 102’s view of the real world and on which a complete virtual environment is displayed (e.g., using video from the camera device 114).

In accordance with an embodiment, the HMD MR alignment device 104 shown in FIG. 1B includes components similar to the MR alignment device 104 discussed in relation to FIG. 1A. For example, the HMD MR alignment device 104 shown in FIG. 1B includes a display device 120, a networking device 112, a camera device 114, a CPU 106, a GPU 108, a memory 122, sensors 116, and one or more input devices 118 (not explicitly shown in FIG. 1B). In the example embodiment, the display device 120 may render graphics (e.g., virtual objects) onto the visor 124. As such, the visor 124 acts as a “screen” or surface on which the output of the display device 120 appears, and through which the wearer 102 experiences virtual content. The display device 120 may be driven or controlled by one or more graphical processing units (GPUs) 108. In accordance with some embodiments, the display device 120 may include the visor 124.

In some embodiments, the digital camera device (or just “camera”) 114 on the MR alignment device 104 is a forward-facing video input device that is oriented so as to capture at least a portion of a field of view (FOV) of the wearer 102. In other words, the camera 114 captures or “sees” an angle of view of the real world based on the orientation of the HMD device 104 (e.g., similar to what the wearer 102 sees in the wearer 102’s FOV when looking through the visor 124). The camera device 114 may be configured to capture real-world digital video around the wearer 102 (e.g., a field of view, a peripheral view, or a 360° view around the wearer 102). In some embodiments, output from the digital camera device 114 may be projected onto the visor 124 (e.g., in opaque visor embodiments), and may also include additional virtual content (e.g., added to the camera output). In some embodiments, there can also be a depth camera on the HMD 104.

In some embodiments, the HMD MR alignment device 104 may include one or more sensors 116, or may be coupled in wired or wireless communication with the sensors. For example, the HMD MR alignment device 104 may include motion or position sensors configured to determine a position or orientation of the HMD 104. In some embodiments, the HMD MR alignment device 104 may include a microphone for capturing audio input (e.g., spoken vocals of the user 102).

In some embodiments, the user 102 may hold one or more input devices 118 including hand tracking devices (“handhelds”) (not separately shown in FIG. 1B) (e.g., one in each hand). The handhelds provide information about the absolute or relative position and orientation of a user’s hands and, as such, are capable of capturing hand gesture information. The handhelds may be configured to operate directly with the HMD MR alignment device 104 (e.g., via wired or wireless communication). In some embodiments, the handhelds may be Oculus Touch™ hand controllers, HTC Vive™ hand trackers, or Playstation VR™ hand controllers. The handhelds may also include one or more buttons or joysticks built into the handheld. In other embodiments, the user 102 may wear one or more wearable hand tracking devices (e.g., motion tracking gloves, not shown), such as those made commercially available by Manus VR (Netherlands). In still other embodiments, hand motion of the user 102 may be tracked without, or in addition to, the handhelds or wearable hand tracking devices via a hand position sensor (not shown, e.g., using optical methods to track the position and orientation of the user’s hands) such as, for example, those made commercially available by Leap Motion, Inc. (a California corporation). Such hand tracking devices (e.g., handhelds) track the position of one or more of the hands of the user during operation.

In accordance with an embodiment, the methods and systems described herein teach how an estimated global position of a device within a structure can be obtained and then refined with a local position by detecting object surfaces in the device surroundings (e.g., via a video of the surroundings) and aligning the object surfaces to digital surfaces present in a digital model representation of the structure. The combination of the global position and the local position allows a determination of the device position precisely within the structure.

In accordance with another embodiment, a plurality of rays are cast from the determined device position in the digital model of the structure. The intersections of the rays with digital surfaces within the model define a point cloud. Similarly, rays are also cast from a location of the device in a scene observed through a camera on the device to create a point cloud of the device surroundings with respect to the real-world structure as seen by the device camera. The two point clouds are then aligned to compute a translation, a scale, and a rotation needed to align the digital model with the real-world structure.

In accordance with another embodiment, an alignment of vertical axes in the model with gravity and an alignment of the lowest horizontal digital surfaces between the model and object surfaces found in the real-world scene reduce a 3D alignment method to a 2D alignment method. In the 2D alignment method, rays are only cast horizontally from the device position, both in the digital model and in the scene captured by the camera, wherein the rays produce two point clouds in 2D (e.g., a first associated with the digital model and a second associated with the scene captured by the camera). Aligning the two point clouds in 2D only requires a rotation in one dimension and a translation and scale in two dimensions, making convergence for the alignment more efficient.

In accordance with an embodiment, the methods described below with respect to FIG. 2, FIG. 3, FIG. 4, and FIG. 5 work without motion tracking devices or image markers within a real-world environment. The methods described with respect to FIG. 2, FIG. 3, FIG. 4, and FIG. 5 allow a user with an MR alignment device 104 to visit a real-world structure for the first time, determine a precise location for the device within the structure, and position a digital model associated with the real-world structure in a video stream generated by the MR alignment device 104.

In accordance with an embodiment, a determination of global position and a determination of local position are performed in two steps complementing each other. An estimated global position of a device is combined with a local geometry alignment method as described with respect to FIG. 2, FIG. 3, FIG. 4, and FIG. 5 in order to align a digital model with a real-world structure surrounding the device and compute a precise position of the device within the structure. The global position is used with local geometry matching on a local area of the structure, preventing the method from erroneously finding locations with similar local geometry elsewhere within the structure.

In accordance with an embodiment, FIG. 2 is a flowchart diagram of a method 200 for aligning a digital model with a real-world structure via an MR alignment device 104. In various embodiments, some of the method elements shown may be performed concurrently, in a different order than shown, or may be omitted. In accordance with an embodiment, at operation 202 of the method 200, the client MR digital model alignment module 124 determines an approximate position for an MR alignment device 104 within a digital model, wherein the digital model represents a real-world structure (e.g., a building) and wherein the MR alignment device 104 is physically within or in proximity of the real-world structure represented by the digital model. In accordance with one embodiment, the approximate position of the MR alignment device 104 can be determined using a Global Positioning System (GPS). For example, the determination may include receiving a latitude, a longitude, and an altitude for the MR alignment device 104 from a global positioning system and computing a position for the device 104 based on a comparison with a predetermined latitude, longitude, and altitude of a point within the digital model (e.g., included in data describing the digital model). In another embodiment, the approximate position of the MR alignment device 104 can be determined by receiving a touch input from the user 102 on a touch screen (e.g., part of the display device 120) or by receiving an input from the input device 118 (e.g., a pointing device) to indicate the approximate position within a rendering of the digital model displayed on the screen. In accordance with some embodiments, the process of indicating the approximate position using a touch input or an input device may be split into several steps, including: selecting a structure (e.g., a specific building), selecting a first part of the structure (e.g., a floor within a building), selecting a second part of the structure within the first part (e.g., a room within the floor), and so forth. In accordance with some embodiments, the process of indicating the approximate position using a touch input or an input device can also include a selection of more precise structural elements such as doors, windows, pillars, and walls.
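
For illustration only, the following sketch (Python) shows one way a GPS fix might be compared with a predetermined geospatial reference point stored with the digital model to produce an approximate device position, as described above. The flat-earth local-tangent approximation, the y-up/x-east/z-north axis convention, and the function and parameter names are assumptions made for this example, not part of the disclosure.

```python
import math

# Rough metres-per-degree constant for a local tangent-plane approximation
# (an assumption for this sketch; any geodetic conversion could be used).
METERS_PER_DEG_LAT = 111_320.0

def gps_to_model_position(lat, lon, alt, ref_lat, ref_lon, ref_alt, ref_model_xyz):
    """Estimate the device position in model coordinates from a GPS fix.

    (ref_lat, ref_lon, ref_alt) is the predetermined geospatial location of a
    known point in the digital model, and ref_model_xyz is that point's model
    coordinate (x east, y up, z north -- an assumed convention)."""
    meters_per_deg_lon = METERS_PER_DEG_LAT * math.cos(math.radians(ref_lat))
    dx = (lon - ref_lon) * meters_per_deg_lon      # east offset in metres
    dz = (lat - ref_lat) * METERS_PER_DEG_LAT      # north offset in metres
    dy = alt - ref_alt                             # vertical offset in metres
    rx, ry, rz = ref_model_xyz
    return (rx + dx, ry + dy, rz + dz)

# Example: a device GPS fix compared against a reference point stored with the model.
approx_pos = gps_to_model_position(
    lat=45.5020, lon=-73.5675, alt=32.0,
    ref_lat=45.5017, ref_lon=-73.5673, ref_alt=30.0,
    ref_model_xyz=(0.0, 0.0, 0.0))
print(approx_pos)
```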

In accordance with an embodiment, at operation 204 of the method 200, the client MR digital model alignment module 124 determines a position and an orientation for a plurality of digital surfaces within the digital model from a perspective of the approximate position (e.g., as determined in operation 202) of the MR alignment device 104 within the digital model.
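
Purely as an illustration of one possible representation for operation 204, the sketch below stores each digital surface as a centre point and a unit normal and keeps only surfaces that face the approximate device position within a given distance. The `Surface` structure, the `visible_digital_surfaces` helper, and the crude front-facing/distance visibility test are assumptions for this example; the disclosure leaves the exact visibility determination open (occlusion, for instance, is ignored here).

```python
import numpy as np
from dataclasses import dataclass

@dataclass
class Surface:
    center: np.ndarray   # 3D position of the surface, model coordinates
    normal: np.ndarray   # unit normal giving the surface orientation

def visible_digital_surfaces(surfaces, device_pos, max_distance=30.0):
    """Keep surfaces that face the approximate device position and lie
    within max_distance of it (a crude visibility proxy for this sketch)."""
    device_pos = np.asarray(device_pos, dtype=float)
    visible = []
    for s in surfaces:
        to_device = device_pos - s.center
        dist = np.linalg.norm(to_device)
        if dist == 0.0 or dist > max_distance:
            continue
        if np.dot(s.normal, to_device / dist) > 0.0:   # front-facing
            visible.append(s)
    return visible
```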

In accordance with an embodiment, at operation 206 of the method 200, the client MR digital model alignment module 124 receives a video stream (e.g., an RGB video stream) from a camera device 114 in the MR alignment device 104. The video stream may include video of an environment surrounding the MR alignment device 104 (e.g., as the MR alignment device 104 is moved through the environment by a user of the device), wherein the environment includes the real-world structure.

In accordance with an embodiment, at operation 208 of the method 200, the client MR digital model alignment module 124 computes a position and an orientation for a plurality of object surfaces of the real-world structure visible within the video stream, the position and orientation being associated with a position and orientation of the MR alignment device 104 within the real-world structure. In accordance with an embodiment, the position and orientation of the object surfaces and the position and orientation of the MR alignment device 104 within the real-world structure are detected in the video stream using a Simultaneous Localization and Mapping (SLAM) algorithm. The SLAM algorithm may use data from the sensors 116 (e.g., accelerometer data and gyroscope data) in addition to the video stream data.
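
Operation 208 would in practice be delivered by a SLAM/plane-detection pipeline; purely to illustrate the bookkeeping involved, the sketch below expresses a surface detected in camera coordinates in the SLAM tracking frame using the camera pose reported by the tracker. The pose format (a 3x3 rotation plus a translation) and the function name are assumptions for this example, not a specific platform API.

```python
import numpy as np

def surface_to_tracking_frame(center_cam, normal_cam, cam_rotation, cam_position):
    """Express a detected surface in the SLAM tracking frame.

    center_cam, normal_cam: surface centre and unit normal in camera coordinates.
    cam_rotation: 3x3 rotation of the camera in the tracking frame (from SLAM).
    cam_position: camera position in the tracking frame (from SLAM)."""
    R = np.asarray(cam_rotation, dtype=float)
    t = np.asarray(cam_position, dtype=float)
    center_world = R @ np.asarray(center_cam, dtype=float) + t
    normal_world = R @ np.asarray(normal_cam, dtype=float)   # rotate only, no translation
    return center_world, normal_world

# Example: a horizontal surface one metre in front of a camera held 1.5 m
# above the tracking origin, with the camera axis-aligned to the tracking frame.
c, n = surface_to_tracking_frame(
    center_cam=[0.0, 0.0, 1.0], normal_cam=[0.0, 1.0, 0.0],
    cam_rotation=np.eye(3), cam_position=[0.0, 1.5, 0.0])
print(c, n)
```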

In accordance with an embodiment, at operation 210 of the method 200, the client MR digital model alignment module 124 determines a 3D translation, a 3D scale, and a 3D rotation to maximize an alignment of the position and orientation of the plurality of digital surfaces (e.g., determined within operation 204) and the position and orientation of the plurality of object surfaces (e.g., determined within operation 208).

In accordance with an embodiment, at operation 212 of the method 200, the client MR digital model alignment module 124 applies the 3D translation, the 3D scale, and the 3D rotation (e.g., determined in operation 210) to the digital model and displays the translated, scaled, and rotated digital model on the display device 120 of the MR alignment device 104. In accordance with an embodiment, as part of operation 212, the client MR digital model alignment module 124 overlays the display of the translated, scaled, and rotated digital model on a display of the received camera video stream. In accordance with an embodiment, due to the determination of translation, scale, and rotation (e.g., from operation 210), some of the plurality of digital surfaces within the digital model (e.g., from operation 204) are aligned in the overlay with associated visible object surfaces (e.g., from operation 208) within the video stream. As such, the displayed overlay of the digital model of the structure aligns with a view of the real-world structure within the video stream.
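
One conventional way to apply the result of operation 210 is to compose the translation, scale, and rotation into a single 4x4 transform and apply it to the model's vertices before rendering the overlay. The sketch below illustrates that composition under assumed conventions (a rotation supplied as a 3x3 matrix, uniform scale, row-vector vertices); it is not tied to any particular rendering engine.

```python
import numpy as np

def compose_trs(translation, scale, rotation_3x3):
    """Build a 4x4 transform applying scale, then rotation, then translation."""
    T = np.eye(4)
    T[:3, :3] = np.asarray(rotation_3x3, dtype=float) * float(scale)
    T[:3, 3] = np.asarray(translation, dtype=float)
    return T

def transform_vertices(vertices, trs):
    """Apply a 4x4 transform to an (N, 3) array of model vertices."""
    v = np.asarray(vertices, dtype=float)
    homo = np.hstack([v, np.ones((v.shape[0], 1))])
    return (homo @ trs.T)[:, :3]

# Example: rotate 90 degrees about the up axis, keep scale 1.0, shift 2 m along x.
yaw = np.radians(90.0)
R = np.array([[np.cos(yaw), 0.0, np.sin(yaw)],
              [0.0, 1.0, 0.0],
              [-np.sin(yaw), 0.0, np.cos(yaw)]])
trs = compose_trs(translation=[2.0, 0.0, 0.0], scale=1.0, rotation_3x3=R)
print(transform_vertices([[1.0, 0.0, 0.0]], trs))
```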

In accordance with an embodiment, FIG. 3 is a flowchart diagram of a method 300 for computing a 3D translation, a 3D scale, and a 3D rotation as part of operation 210, wherein the method 300 includes an alignment of two point clouds. In accordance with an embodiment, at operation 302 of the method 300, the client MR digital model alignment module 124 digitally casts a plurality of rays within the digital model in a plurality of directions from the approximate position of the MR alignment device 104 within the digital model (e.g., the approximate position as determined as part of operation 202 of the method 200). In accordance with an embodiment, a ray of the plurality of rays may be cast out to a maximum predetermined distance from the approximate position. At operation 304 of the method 300, the client MR digital model alignment module 124 generates a first 3D point cloud which includes a point for each location whereby a ray from the plurality of rays intersects with a digital surface within the plurality of digital surfaces in the digital model (e.g., the plurality of digital surfaces referred to in operation 204). In accordance with an embodiment, at operation 306 of the method 300, the client MR digital model alignment module 124 computes a 3D position for the MR alignment device 104 (e.g., relative to the camera device 114) in the video stream relative to the plurality of object surfaces visible in the video stream (e.g., the plurality of object surfaces referred to in operation 208). In accordance with an embodiment, at operation 308 of the method 300, the client MR digital model alignment module 124 digitally casts a plurality of rays from the computed camera position (e.g., determined in operation 306) out in a plurality of directions towards the plurality of object surfaces visible within the video stream. In accordance with an embodiment, at operation 310 of the method 300, the client MR digital model alignment module 124 generates a second 3D point cloud which includes a point for each location whereby a ray of the plurality of rays from operation 308 intersects with an object surface within the plurality of object surfaces visible from the video stream. In accordance with an embodiment, at operation 312 of the method 300, the client MR digital model alignment module 124 computes a 3D translation, a 3D scale, and a 3D rotation that maximize an alignment of the first point cloud and the second point cloud. In accordance with an embodiment, operation 312, involving an alignment of the first point cloud (e.g., representing digital surfaces within the digital model) with the second point cloud (e.g., representing object surfaces in the real-world structure captured by the camera device 114), may be performed using an ICP (Iterative Closest Point) computation. Embodiments of the present disclosure are not limited in this regard; any alignment method can be used to align the first point cloud with the second point cloud.
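
For illustration, a compact point-to-point ICP sketch of the kind of alignment operation 312 could use: nearest-neighbour correspondences followed by a Kabsch (SVD) fit in each iteration. Ray casting against the model and the tracked scene is platform-specific and is replaced here by two pre-computed (N, 3) point arrays; the fixed iteration count, the brute-force nearest-neighbour search, and the rigid (equal-scale) assumption are simplifications made for the example, not the claimed method.

```python
import numpy as np

def best_rigid_fit(src, dst):
    """Rotation R and translation t minimising ||R @ src_i + t - dst_i||
    over paired points (Kabsch / SVD)."""
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])  # avoid reflections
    R = Vt.T @ D @ U.T
    t = dst_c - R @ src_c
    return R, t

def icp(model_points, scene_points, iterations=20):
    """Align the digital-model point cloud to the scene point cloud.

    Returns R and t such that R @ p + t moves a model point toward the scene."""
    src = np.asarray(model_points, dtype=float).copy()
    dst = np.asarray(scene_points, dtype=float)
    R_total, t_total = np.eye(3), np.zeros(3)
    for _ in range(iterations):
        # Brute-force nearest neighbour in the scene for every model point.
        d2 = ((src[:, None, :] - dst[None, :, :]) ** 2).sum(axis=2)
        matched = dst[d2.argmin(axis=1)]
        R, t = best_rigid_fit(src, matched)
        src = src @ R.T + t
        R_total, t_total = R @ R_total, R @ t_total + t
    return R_total, t_total
```

A practical implementation would also test for convergence and reject correspondences beyond a distance threshold rather than running a fixed number of iterations.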

In accordance with an example embodiment, at operation 312 of the method 300, a scale of the digital model may be assumed to be substantially equal to a scale of the real-world scene captured by the camera device 114 (e.g., based on a generation of the digital model using a real-world scale for digital objects therein). In the example embodiment, at operation 312 it is then sufficient to solve only for translation and rotation in order to align the digital model with the real-world scene; accordingly, the client MR digital model alignment module 124 computes a 3D translation and a 3D rotation that maximize an alignment of the first point cloud and the second point cloud.

In accordance with an embodiment, and as shown in FIG. 4 and FIG. 5, the 3D alignment determination from operation 210 is reduced to a 2D alignment determination by assuming that one or more lowest horizontal digital surfaces (e.g., determined below in operation 404) in the digital model and one or more lowest horizontal object surfaces (e.g., determined below in operation 404) in a real-world structure or scene (e.g., as seen via a video) both represent a same floor (e.g., or a ground), are perpendicular to the direction of gravity, and are at the same altitude. With this assumption, rotation is constrained around an axis of gravity and translation is constrained to be parallel to the horizontal surfaces, solving one dimension of the problem. In the example 2D embodiments described with respect to FIG. 4 and FIG. 5, a vertical axis in the digital model of the structure and a vertical axis in the structure within the received video are assumed to be parallel to an axis of gravity. The assumption solves two of the three possible rotation dimensions needed to align the digital model with a structure in reality, leaving a rotation around the axis of gravity as the only rotational degree of freedom.

In accordance with an embodiment, FIG. 4 is a flowchart diagram of a method 400 for computing a 3D translation, a 3D scale, and a 3D rotation as part of operation 210. In various embodiments, some of the method 400 elements shown may be performed concurrently, in a different order than shown, or may be omitted. In accordance with an embodiment, at operation 402 of the method 400, a first vertical axis is determined for the digital model and a second vertical axis is determined for the real-world structure. In accordance with an embodiment, the second vertical axis may be determined from an analysis of the video stream data received in operation 206, wherein the analysis may include an analysis of the object surfaces visible within the video stream as determined in operation 208. In accordance with an embodiment, the operations of the method 400 are performed with an assumption that the first vertical axis of the digital model and the second vertical axis of the real-world structure are parallel to each other and parallel to the direction of the force of gravity (e.g., the assumption is equivalent to aligning a direction of gravity in both the digital model and the real-world structure such that ‘up’ and ‘down’ are aligned in both). In accordance with an embodiment, as part of operation 402 of the method 400, the client MR digital model alignment module 124 restricts rotation of the digital model to a rotation around the first vertical axis. In accordance with an embodiment, at operation 404 of the method 400, the client MR digital model alignment module 124 determines a lowest horizontal digital surface from the plurality of digital surfaces in the digital model (e.g., one or more horizontal digital surfaces which have or share a lowest position as determined in operation 204) and determines a lowest horizontal object surface from the plurality of object surfaces visible in the video stream (e.g., one or more horizontal object surfaces which have or share a lowest position as determined in operation 208). In accordance with an embodiment, operation 404 may be understood as a search for a local floor in both the digital model and the real-world structure. In accordance with an embodiment, at operation 406 of the method 400, the client MR digital model alignment module 124 computes a translation of the digital model so that the lowest horizontal digital surface in the plurality of digital surfaces in the digital model (e.g., as determined in operation 404) is aligned with the lowest horizontal object surface of the plurality of object surfaces visible in the video stream (e.g., as determined in operation 404). In accordance with an embodiment, the translation may be a vertical translation of the digital model such that a floor in the digital model aligns with an associated floor in the real-world structure as seen in the video stream. At operation 408 of the method 400, the client MR digital model alignment module 124 computes a 2D translation, a 2D scale, and a 2D rotation that maximize an alignment of digital surfaces within the plurality of digital surfaces of the digital model with object surfaces of the plurality of object surfaces visible in the video stream. In accordance with an embodiment, the optimizations of the translation, scale, and rotation determined in operation 408 are 2D optimizations due to the alignment of the lowest horizontal surfaces (e.g., in operation 406) and the alignment of the first and second vertical axes in operation 402. At operation 410 of the method 400, the client MR digital model alignment module 124 combines the vertical translation determined in operation 406, the alignment of the first and second vertical axes from operation 402, and the 2D translation, the 2D scale, and the 2D rotation from operation 408 to obtain a 3D translation, a 3D scale, and a 3D rotation.
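
The sketch below illustrates, under the stated gravity-alignment assumption, how the vertical translation of operation 406 and a 2D result from operation 408 (rotation about the gravity axis, horizontal translation, uniform scale) could be combined into the 3D transform of operation 410. The y-up axis convention and the function names are assumptions made for this example.

```python
import numpy as np

def floor_offset(model_floor_y, scene_floor_y):
    """Vertical translation aligning the model's lowest horizontal surface
    with the lowest horizontal surface seen in the video (operation 406)."""
    return scene_floor_y - model_floor_y

def combine_2d_into_3d(yaw, tx, tz, scale, ty):
    """Lift a 2D alignment (rotation about the gravity axis, horizontal
    translation, uniform scale) plus the vertical offset ty into a
    3D rotation, translation, and scale (operation 410)."""
    c, s = np.cos(yaw), np.sin(yaw)
    R = np.array([[c, 0.0, s],
                  [0.0, 1.0, 0.0],
                  [-s, 0.0, c]])          # rotation about the vertical (y) axis
    t = np.array([tx, ty, tz])
    return R, t, scale

# Example: floor alignment gives the vertical offset; the 2D fit supplies the rest.
ty = floor_offset(model_floor_y=0.0, scene_floor_y=-1.4)
R, t, s = combine_2d_into_3d(yaw=np.radians(30.0), tx=1.2, tz=-0.5, scale=1.0, ty=ty)
print(R, t, s)
```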

In accordance with an example embodiment, at operation 408 of the method 400, a scale of the digital model may be assumed to be substantially equal to a scale of the real-world scene captured by the camera device 114 (e.g., based on a creation of the digital model using a real-world scale for digital objects therein). In the example embodiment, at operation 408 of the method 400, it is then sufficient to solve only for translation and rotation in order to align the digital model with the real-world scene; accordingly, the client MR digital model alignment module 124 computes a 2D translation and a 2D rotation that maximize an alignment of digital surfaces within the plurality of digital surfaces of the digital model with object surfaces of the plurality of object surfaces visible in the video stream.

In accordance with an embodiment, FIG. 5 is a flowchart diagram of a method 500 for computing a 2D translation, a 2D scale, and a 2D rotation as part of operation 408. In various embodiments, some of the method 500 elements shown may be performed concurrently, in a different order than shown, or may be omitted. In accordance with an embodiment, at operation 502 of the method 500, an observation height is determined, wherein the observation height may correspond to a height of the MR alignment device 104 above a floor (e.g., or ground) in the real-world structure underneath the device 104. In accordance with an embodiment, the observation height may be determined in one of the following ways: taken from a predetermined value (e.g., stored in a memory), received as input from a user (e.g., a value input via an input device 118), determined by an AI agent (e.g., an AI agent trained to determine an observation height), or estimated as a height of the MR alignment device 104 above a ground (e.g., or floor) from an analysis of the received video stream data or sensor device 116 data (e.g., as part of operation 202 or 208). In accordance with an embodiment, at operation 504 of the method 500, the client MR digital model alignment module 124 computes an observation position in the digital model that corresponds to a distance of the observation height above the lowest horizontal digital surface (e.g., the lowest horizontal digital surface as determined in operation 404) and is located on a line projected vertically (e.g., upward or downward) from the position of the MR alignment device 104 within the digital model (e.g., the position as determined in operation 202). In accordance with an embodiment, at operation 506 of the method 500, the client MR digital model alignment module 124 casts (e.g., digitally projects) a plurality of rays horizontally outward from the observation position in the digital model and creates a first 2D point cloud which includes a point for each location whereby a ray from the plurality of rays intersects a digital surface from the plurality of digital surfaces within the digital model (e.g., wherein the digital surfaces may be determined in operation 204). In accordance with an embodiment, a ray may be cast outward by a predetermined distance and may be removed from calculations in other operations if no intersection with a digital surface occurs along the distance. In accordance with an embodiment, at operation 508 of the method 500, the client MR digital model alignment module 124 computes a second observation position in the video stream (e.g., within a coordinate system associated with the video stream), wherein the position corresponds to a distance of the observation height above the lowest horizontal object surface (e.g., the lowest horizontal object surface as determined in operation 404) and is located on a line projected vertically (e.g., upward or downward) from a position associated with the MR alignment device 104 camera within the video stream (e.g., the position as determined in operation 208). In accordance with an embodiment, at operation 510 of the method 500, the client MR digital model alignment module 124 casts (e.g., digitally projects) a second plurality of rays horizontally outward from the second observation position in the video stream and creates a second 2D point cloud which includes a point for each location whereby a ray from the second plurality of rays intersects an object surface from the plurality of object surfaces within the video stream (e.g., wherein the object surfaces may be determined within operation 208). In accordance with an embodiment, at operation 512 of the method 500, the client MR digital model alignment module 124 computes a 2D translation, a 2D scale, and a 2D rotation that maximize an alignment of the first point cloud and the second point cloud. In accordance with an embodiment, operation 512, involving an alignment of the first point cloud (e.g., representing digital surfaces within the digital model) with the second point cloud (e.g., representing object surfaces in the real-world structure captured by the camera device 114), may be performed using an ICP (Iterative Closest Point) computation. Embodiments of the present disclosure are not limited in this regard; any alignment method can be used to align the first point cloud with the second point cloud.
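
For illustration, the sketch below shows a closed-form 2D similarity fit (rotation about the vertical axis, uniform scale, and 2D translation) of the kind operation 512 could run inside each ICP iteration. The 2D points are assumed to be the horizontal components of the ray intersections, and the correspondences are assumed to be already paired (in practice they would come from nearest-neighbour matching, as in the 3D sketch above); the Umeyama-style estimator used here is one choice among many.

```python
import numpy as np

def fit_similarity_2d(src, dst):
    """Scale s, 2x2 rotation R, and translation t minimising
    ||s * R @ src_i + t - dst_i||^2 over paired 2D points."""
    src, dst = np.asarray(src, float), np.asarray(dst, float)
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    src0, dst0 = src - src_c, dst - dst_c
    # Rotation angle from the dot- and cross-products of the centred point sets.
    cos_sum = (src0 * dst0).sum()
    sin_sum = (src0[:, 0] * dst0[:, 1] - src0[:, 1] * dst0[:, 0]).sum()
    theta = np.arctan2(sin_sum, cos_sum)
    R = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    s = (dst0 * (src0 @ R.T)).sum() / (src0 ** 2).sum()
    t = dst_c - s * R @ src_c
    return s, R, t

# Example: recover a known 2D similarity from three matched points.
src = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]])
angle = np.radians(40.0)
R_true = np.array([[np.cos(angle), -np.sin(angle)],
                   [np.sin(angle),  np.cos(angle)]])
dst = 1.5 * src @ R_true.T + np.array([0.3, -0.7])
print(fit_similarity_2d(src, dst))
```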

In accordance with an example embodiment, at operation 512 of the method 500, a scale of the digital model may be assumed to be substantially equal to a scale of the real-world scene captured by the camera device 114 (e.g., based on a generation of the digital model using a real-world scale for digital objects therein). In the example embodiment, at operation 512 of the method 500, it is then sufficient to solve only for translation and rotation in order to align the model with the real-world scene; accordingly, the client MR digital model alignment module 124 computes a 2D translation and a 2D rotation that maximize an alignment of the first point cloud and the second point cloud.

While illustrated in the block diagrams as groups of discrete components communicating with each other via distinct data signal connections, it will be understood by those skilled in the art that the various embodiments may be provided by a combination of hardware and software components, with some components being implemented by a given function or operation of a hardware or software system, and many of the data paths illustrated being implemented by data communication within a computer application or operating system. The structure illustrated is thus provided for efficiency of teaching the present various embodiments.

It should be noted that the present disclosure can be carried out as a method, and can be embodied in a system, a computer readable medium, or an electrical or electro-magnetic signal. The embodiments described above and illustrated in the accompanying drawings are intended to be exemplary only. It will be evident to those skilled in the art that modifications may be made without departing from this disclosure. Such modifications are considered as possible variants and lie within the scope of the disclosure.

Certain embodiments are described herein as including logic or a number of components, modules, or mechanisms. Modules may constitute either software modules (e.g., code embodied on a machine-readable medium or in a transmission signal) or hardware modules. A “hardware module” is a tangible unit capable of performing certain operations and may be configured or arranged in a certain physical manner. In various example embodiments, one or more computer systems (e.g., a standalone computer system, a client computer system, or a server computer system) or one or more hardware modules of a computer system (e.g., a processor or a group of processors) may be configured by software (e.g., an application or application portion) as a hardware module that operates to perform certain operations as described herein.

In some embodiments, a hardware module may be implemented mechanically, electronically, or with any suitable combination thereof. For example, a hardware module may include dedicated circuitry or logic that is permanently configured to perform certain operations. For example, a hardware module may be a special-purpose processor, such as a field-programmable gate array (FPGA) or an Application Specific Integrated Circuit (ASIC). A hardware module may also include programmable logic or circuitry that is temporarily configured by software to perform certain operations. For example, a hardware module may include software encompassed within a general-purpose processor or other programmable processor. Such software may at least temporarily transform the general-purpose processor into a special-purpose processor. It will be appreciated that the decision to implement a hardware module mechanically, in dedicated and permanently configured circuitry, or in temporarily configured circuitry (e.g., configured by software) may be driven by cost and time considerations.

Accordingly, the phrase “hardware module” should be understood to encompass a tangible entity, be that an entity that is physically constructed, permanently configured (e.g., hardwired), or temporarily configured (e.g., programmed) to operate in a certain manner or to perform certain operations described herein. As used herein, “hardware-implemented module” refers to a hardware module. Considering embodiments in which hardware modules are temporarily configured (e.g., programmed), each of the hardware modules need not be configured or instantiated at any one instance in time. For example, where a hardware module comprises a general-purpose processor configured by software to become a special-purpose processor, the general-purpose processor may be configured as respectively different special-purpose processors (e.g., comprising different hardware modules) at different times. Software may accordingly configure a particular processor or processors, for example, to constitute a particular hardware module at one instance of time and to constitute a different hardware module at a different instance of time.

Hardware modules can provide information to, and receive information from, other hardware modules. Accordingly, the described hardware modules may be regarded as being communicatively coupled. Where multiple hardware modules exist contemporaneously, communications may be achieved through signal transmission (e.g., over appropriate circuits and buses) between or among two or more of the hardware modules. In embodiments in which multiple hardware modules are configured or instantiated at different times, communications between such hardware modules may be achieved, for example, through the storage and retrieval of information in memory structures to which the multiple hardware modules have access. For example, one hardware module may perform an operation and store the output of that operation in a memory device to which it is communicatively coupled. A further hardware module may then, at a later time, access the memory device to retrieve and process the stored output. Hardware modules may also initiate communications with input or output devices, and can operate on a resource (e.g., a collection of information).

The various operations of example methods described herein may be performed, at least partially, by one or more processors that are temporarily configured (e.g., by software) or permanently configured to perform the relevant operations. Whether temporarily or permanently configured, such processors may constitute processor-implemented modules that operate to perform one or more operations or functions described herein. As used herein, “processor-implemented module” refers to a hardware module implemented using one or more processors.

Similarly, the methods described herein may be at least partially processor-implemented, with a particular processor or processors being an example of hardware. For example, at least some of the operations of a method may be performed by one or more processors or processor-implemented modules. Moreover, the one or more processors may also operate to support performance of the relevant operations in a “cloud computing” environment or as “software as a service” (SaaS). For example, at least some of the operations may be performed by a group of computers (as examples of machines including processors), with these operations being accessible via a network (e.g., the Internet) and via one or more appropriate interfaces (e.g., an application program interface (API)).

The performance of certain of the operations may be distributed among the processors, not only residing within a single machine, but deployed across a number of machines. In some example embodiments, the processors or processor-implemented modules may be located in a single geographic location (e.g., within a home environment, an office environment, or a server farm). In other example embodiments, the processors or processor-implemented modules may be distributed across a number of geographic locations.

FIG. 6 is a block diagram 600 illustrating an example software architecture 602, which may be used in conjunction with various hardware architectures herein described to provide components of the MR digital model alignment system 100. FIG. 6 is a non-limiting example of a software architecture, and it will be appreciated that many other architectures may be implemented to facilitate the functionality described herein. The software architecture 602 may execute on hardware such as a machine 700 of FIG. 7 that includes, among other things, processors 710, memory 730, and input/output (I/O) components 750. A representative hardware layer 604 is illustrated and can represent, for example, the machine 700 of FIG. 7. The representative hardware layer 604 includes a processing unit 606 having associated executable instructions 608. The executable instructions 608 represent the executable instructions of the software architecture 602, including implementation of the methods, modules, and so forth described herein. The hardware layer 604 also includes memory/storage 610, which also includes the executable instructions 608. The hardware layer 604 may also comprise other hardware 612.

In the example architecture of FIG. 6, the software architecture 602 may be conceptualized as a stack of layers where each layer provides particular functionality. For example, the software architecture 602 may include layers such as an operating system 614, libraries 616, frameworks or middleware 618, applications 620, and a presentation layer 644. Operationally, the applications 620 and/or other components within the layers may invoke application programming interface (API) calls 624 through the software stack and receive a response as messages 626. The layers illustrated are representative in nature and not all software architectures have all layers. For example, some mobile or special purpose operating systems may not provide the frameworks/middleware 618, while others may provide such a layer. Other software architectures may include additional or different layers.

The operating system 614 may manage hardware resources and provide common services. The operating system 614 may include, for example, a kernel 628, services 630, and drivers 632. The kernel 628 may act as an abstraction layer between the hardware and the other software layers. For example, the kernel 628 may be responsible for memory management, processor management (e.g., scheduling), component management, networking, security settings, and so on. The services 630 may provide other common services for the other software layers. The drivers 632 may be responsible for controlling or interfacing with the underlying hardware. For instance, the drivers 632 may include display drivers, camera drivers, Bluetooth® drivers, flash memory drivers, serial communication drivers (e.g., Universal Serial Bus (USB) drivers), Wi-Fi® drivers, audio drivers, power management drivers, and so forth depending on the hardware configuration.

The libraries 616 may provide a common infrastructure that may be used by the applications 620 and/or other components and/or layers. The libraries 616 typically provide functionality that allows other software modules to perform tasks in an easier fashion than interfacing directly with the underlying operating system 614 functionality (e.g., kernel 628, services 630, and/or drivers 632). The libraries 616 may include system libraries 634 (e.g., a C standard library) that may provide functions such as memory allocation functions, string manipulation functions, mathematic functions, and the like. In addition, the libraries 616 may include API libraries 636 such as media libraries (e.g., libraries to support presentation and manipulation of various media formats such as MPEG4, H.264, MP3, AAC, AMR, JPG, and PNG), graphics libraries (e.g., an OpenGL framework that may be used to render 2D and 3D graphic content on a display), database libraries (e.g., SQLite, which may provide various relational database functions), web libraries (e.g., WebKit, which may provide web browsing functionality), and the like. The libraries 616 may also include a wide variety of other libraries 638 to provide many other APIs to the applications 620 and other software components/modules.

The frameworks 618 (also sometimes referred to as middleware) provide a higher-level common infrastructure that may be used by the applications 620 and/or other software components/modules. For example, the frameworks/middleware 618 may provide various graphic user interface (GUI) functions, high-level resource management, high-level location services, and so forth. The frameworks/middleware 618 may provide a broad spectrum of other APIs that may be utilized by the applications 620 and/or other software components/modules, some of which may be specific to a particular operating system or platform.

The applications 620 include built-in applications 640 and/or third-party applications 642. Examples of representative built-in applications 640 may include, but are not limited to, a contacts application, a browser application, a book reader application, a location application, a media application, a messaging application, and/or a game application. Third-party applications 642 may include any application developed using the Android™ or iOS™ software development kit (SDK) by an entity other than the vendor of the particular platform, and may be mobile software running on a mobile operating system such as iOS™, Android™, Windows® Phone, or other mobile operating systems. The third-party applications 642 may invoke the API calls 624 provided by the mobile operating system such as the operating system 614 to facilitate the functionality described herein.

The applications 620 may use built-in operating system functions (e.g., kernel 628, services 630, and/or drivers 632), libraries 616, or frameworks/middleware 618 to create user interfaces to interact with users of the system. Alternatively, or additionally, in some systems, interactions with a user may occur through a presentation layer, such as the presentation layer 644. In these systems, the application/module “logic” can be separated from the aspects of the application/module that interact with a user.

Some software architectures use virtual machines. In the example of FIG. 6, this is illustrated by a virtual machine 648. The virtual machine 648 creates a software environment where applications/modules can execute as if they were executing on a hardware machine (such as the machine 700 of FIG. 7, for example). The virtual machine 648 is hosted by a host operating system (e.g., operating system 614) and typically, although not always, has a virtual machine monitor 646, which manages the operation of the virtual machine 648 as well as the interface with the host operating system (i.e., operating system 614). A software architecture executes within the virtual machine 648, such as an operating system (OS) 650, libraries 652, frameworks 654, applications 656, and/or a presentation layer 658. These layers of software architecture executing within the virtual machine 648 can be the same as corresponding layers previously described or may be different.

FIG. 7 is a block diagram illustrating components of a machine 700, according to some example embodiments, configured to read instructions from a machine-readable medium (e.g., a machine-readable storage medium) and perform any one or more of the methodologies discussed herein. In some embodiments, the machine 700 is similar to the MR alignment device 104. Specifically, FIG. 7 shows a diagrammatic representation of the machine 700 in the example form of a computer system, within which instructions 716 (e.g., software, a program, an application, an applet, an app, or other executable code) for causing the machine 700 to perform any one or more of the methodologies discussed herein may be executed. As such, the instructions 716 may be used to implement modules or components described herein. The instructions transform the general, non-programmed machine into a particular machine programmed to carry out the described and illustrated functions in the manner described. In alternative embodiments, the machine 700 operates as a standalone device or may be coupled (e.g., networked) to other machines. In a networked deployment, the machine 700 may operate in the capacity of a server machine or a client machine in a server-client network environment, or as a peer machine in a peer-to-peer (or distributed) network environment. The machine 700 may comprise, but not be limited to, a server computer, a client computer, a personal computer (PC), a tablet computer, a laptop computer, a netbook, a set-top box (STB), a personal digital assistant (PDA), an entertainment media system, a cellular telephone, a smart phone, a mobile device, a wearable device (e.g., a smart watch), a smart home device (e.g., a smart appliance), other smart devices, a web appliance, a network router, a network switch, a network bridge, or any machine capable of executing the instructions 716, sequentially or otherwise, that specify actions to be taken by the machine 700. Further, while only a single machine 700 is illustrated, the term “machine” shall also be taken to include a collection of machines that individually or jointly execute the instructions 716 to perform any one or more of the methodologies discussed herein.

The machine 700 may include processors 710, memory 730, and input/output (I/O) components 750, which may be configured to communicate with each other such as via a bus 702. In an example embodiment, the processors 710 (e.g., a Central Processing Unit (CPU), a Reduced Instruction Set Computing (RISC) processor, a Complex Instruction Set Computing (CISC) processor, a Graphics Processing Unit (GPU), a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Radio-Frequency Integrated Circuit (RFIC), another processor, or any suitable combination thereof) may include, for example, a processor 712 and a processor 714 that may execute the instructions 716. The term “processor” is intended to include a multi-core processor that may comprise two or more independent processors (sometimes referred to as “cores”) that may execute instructions contemporaneously. Although FIG. 7 shows multiple processors, the machine 700 may include a single processor with a single core, a single processor with multiple cores (e.g., a multi-core processor), multiple processors with a single core, multiple processors with multiple cores, or any combination thereof.

The memory/storage 730 may include a memory, such as a main memory 732, a static memory 734, or other memory, and a storage unit 736, both accessible to the processors 710 such as via the bus 702. The storage unit 736 and memory 732, 734 store the instructions 716 embodying any one or more of the methodologies or functions described herein. The instructions 716 may also reside, completely or partially, within the memory 732, 734, within the storage unit 736, within at least one of the processors 710 (e.g., within the processor’s cache memory), or any suitable combination thereof, during execution thereof by the machine 700. Accordingly, the memory 732, 734, the storage unit 736, and the memory of the processors 710 are examples of machine-readable media 738.

As used herein, “machine-readable medium” means a device able to store instructions and data temporarily or permanently and may include, but is not limited to, random-access memory (RAM), read-only memory (ROM), buffer memory, flash memory, optical media, magnetic media, cache memory, other types of storage (e.g., Electrically Erasable Programmable Read-Only Memory (EEPROM)), and/or any suitable combination thereof. The term “machine-readable medium” should be taken to include a single medium or multiple media (e.g., a centralized or distributed database, or associated caches and servers) able to store the instructions 716. The term “machine-readable medium” shall also be taken to include any medium, or combination of multiple media, that is capable of storing instructions (e.g., instructions 716) for execution by a machine (e.g., machine 700), such that the instructions, when executed by one or more processors of the machine 700 (e.g., processors 710), cause the machine 700 to perform any one or more of the methodologies or operations, including non-routine or unconventional methodologies or operations, or non-routine or unconventional combinations of methodologies or operations, described herein. Accordingly, a “machine-readable medium” refers to a single storage apparatus or device, as well as “cloud-based” storage systems or storage networks that include multiple storage apparatuses or devices. The term “machine-readable medium” excludes signals per se.

The input/output (I/O) components 750 may include a wide variety of components to receive input, provide output, produce output, transmit information, exchange information, capture measurements, and so on. The specific input/output (I/O) components 750 that are included in a particular machine will depend on the type of machine. For example, portable machines such as mobile phones will likely include a touch input device or other such input mechanisms, while a headless server machine will likely not include such a touch input device. It will be appreciated that the input/output (I/O) components 750 may include many other components that are not shown in FIG. 7. The input/output (I/O) components 750 are grouped according to functionality merely for simplifying the following discussion, and the grouping is in no way limiting. In various example embodiments, the input/output (I/O) components 750 may include output components 752 and input components 754. The output components 752 may include visual components (e.g., a display such as a plasma display panel (PDP), a light emitting diode (LED) display, a liquid crystal display (LCD), a projector, or a cathode ray tube (CRT)), acoustic components (e.g., speakers), haptic components (e.g., a vibratory motor, resistance mechanisms), other signal generators, and so forth. The input components 754 may include alphanumeric input components (e.g., a keyboard, a touch screen configured to receive alphanumeric input, a photo-optical keyboard, or other alphanumeric input components), point-based input components (e.g., a mouse, a touchpad, a trackball, a joystick, a motion sensor, or another pointing instrument), tactile input components (e.g., a physical button, a touch screen that provides location and/or force of touches or touch gestures, or other tactile input components), audio input components (e.g., a microphone), and the like.

In further example embodiments, the input/output (I/O) components 750 may include biometric components 756, motion components 758, environmental components 760, or position components 762, among a wide array of other components. For example, the biometric components 756 may include components to detect expressions (e.g., hand expressions, facial expressions, vocal expressions, body gestures, or eye tracking), measure biosignals (e.g., blood pressure, heart rate, body temperature, perspiration, or brain waves), identify a person (e.g., voice identification, retinal identification, facial identification, fingerprint identification, or electroencephalogram-based identification), and the like. The motion components 758 may include acceleration sensor components (e.g., accelerometer), gravitation sensor components, rotation sensor components (e.g., gyroscope), and so forth. The environmental components 760 may include, for example, illumination sensor components (e.g., photometer), temperature sensor components (e.g., one or more thermometers that detect ambient temperature), humidity sensor components, pressure sensor components (e.g., barometer), acoustic sensor components (e.g., one or more microphones that detect background noise), proximity sensor components (e.g., infrared sensors that detect nearby objects), gas sensors (e.g., gas detection sensors to detect concentrations of hazardous gases for safety or to measure pollutants in the atmosphere), or other components that may provide indications, measurements, or signals corresponding to a surrounding physical environment. The position components 762 may include location sensor components (e.g., a Global Positioning System (GPS) receiver component), altitude sensor components (e.g., altimeters or barometers that detect air pressure from which altitude may be derived), orientation sensor components (e.g., magnetometers), and the like.

Communication may be implemented using a wide variety of technologies. The input/output (I/O) components 750 may include communication components 764 operable to couple the machine 700 to a network 780 or devices 770 via a coupling 782 and a coupling 772, respectively. For example, the communication components 764 may include a network interface component or other suitable device to interface with the network 780. In further examples, the communication components 764 may include wired communication components, wireless communication components, cellular communication components, Near Field Communication (NFC) components, Bluetooth® components (e.g., Bluetooth® Low Energy), Wi-Fi® components, and other communication components to provide communication via other modalities. The devices 770 may be another machine or any of a wide variety of peripheral devices (e.g., a peripheral device coupled via a Universal Serial Bus (USB)).

Moreover, the communication components 764 may detect identifiers or include components operable to detect identifiers. For example, the communication components 764 may include Radio Frequency Identification (RFID) tag reader components, NFC smart tag detection components, optical reader components (e.g., an optical sensor to detect one-dimensional bar codes such as Universal Product Code (UPC) bar codes, multidimensional bar codes such as Quick Response (QR) codes, Aztec codes, Data Matrix, Dataglyph, MaxiCode, PDF417, Ultra Code, UCC RSS-2D bar codes, and other optical codes), or acoustic detection components (e.g., microphones to identify tagged audio signals). In addition, a variety of information may be derived via the communication components 764, such as location via Internet Protocol (IP) geo-location, location via Wi-Fi® signal triangulation, location via detecting an NFC beacon signal that may indicate a particular location, and so forth.

Throughout this specification, plural instances may implement components, operations, or structures described as a single instance. Although individual operations of one or more methods are illustrated and described as separate operations, one or more of the individual operations may be performed concurrently, and nothing requires that the operations be performed in the order illustrated. Structures and functionality presented as separate components in example configurations may be implemented as a combined structure or component. Similarly, structures and functionality presented as a single component may be implemented as separate components. These and other variations, modifications, additions, and improvements fall within the scope of the subject matter herein.

The embodiments illustrated herein are described in sufficient detail to enable those skilled in the art to practice the teachings disclosed. Other embodiments may be used and derived therefrom, such that structural and logical substitutions and changes may be made without departing from the scope of this disclosure. The Detailed Description, therefore, is not to be taken in a limiting sense, and the scope of various embodiments is defined only by the appended claims, along with the full range of equivalents to which such claims are entitled.

As used herein, the term “or” may be construed in either an inclusive or exclusive sense. Moreover, plural instances may be provided for resources, operations, or structures described herein as a single instance. Additionally, boundaries between various resources, operations, modules, engines, and data stores are somewhat arbitrary, and particular operations are illustrated in a context of specific illustrative configurations. Other allocations of functionality are envisioned and may fall within a scope of various embodiments of the present disclosure. In general, structures and functionality presented as separate resources in the example configurations may be implemented as a combined structure or resource. Similarly, structures and functionality presented as a single resource may be implemented as separate resources. These and other variations, modifications, additions, and improvements fall within the scope of embodiments of the present disclosure as represented by the appended claims. The specification and drawings are, accordingly, to be regarded in an illustrative rather than a restrictive sense.

1. (canceled)
2. A system comprising: one or more computer processors; one or more computer memories; a set of instructions incorporated into the one or more computer memories, the set of instructions configuring the one or more computer processors to perform operations, the operations comprising: receiving a video stream captured by a camera device, the video stream including a view of a structure, the structure being associated with a digital model; determining an approximate position of a virtual camera in the digital model, wherein the approximate position corresponds with the view; determining a position and an orientation for one or more digital surfaces within the digital model visible from the approximate position of the virtual camera; determining a position and an orientation of one or more object surfaces visible in the video stream; limiting a rotation of the digital model around a vertical axis, aligning the vertical axis of the digital model with a second vertical axis determined from the video stream, and determining a vertical translation of the digital model so that a lowest horizontal digital surface is aligned with a lowest horizontal object surface visible on the video stream; and applying the vertical translation and the limiting of the rotation to the digital model and displaying the digital model contemporaneously with a display of the video stream.
3. The system of claim 2, the operations further comprising determining an observation position in the digital model that corresponds to a configurable distance above the lowest horizontal digital surface and associated with the approximate position of the virtual camera.
4. The system of claim 3, the operations further comprising casting a plurality of rays horizontally from the observation position in the digital model and creating a first 2D point cloud including a point for each location whereby a ray from the plurality of rays intersects with a digital surface.
5. The system of claim 3, the operations further comprising determining a second observation position in the video stream that corresponds to the configurable distance above a lowest horizontal object surface in the video stream.
6. The system of claim 5, the operations further comprising casting a plurality of rays horizontally from the second observation position and creating a second 2D point cloud including a point for each location whereby a ray from the plurality of rays intersects an object surface.
7. The system of claim 6, the operations further comprising determining a 2D translation, a 2D scale, and a 2D rotation that maximize an alignment of the first point cloud and the second point cloud and applying the 2D translation, the 2D scale, and the 2D rotation to the digital model.
8. The system of claim 2, wherein the approximate position of the virtual camera in the digital model is estimated by performing one or more of the following: receiving input data from an input device, wherein the input data describes the approximate position in the digital model; and receiving a latitude, a longitude, and an altitude from a global positioning system associated with the camera device and computing the approximate position of the virtual camera based on a comparison with a predetermined latitude, longitude, and altitude of a point within the digital model.
9. The system of claim 2, the operations further comprising: digitally projecting a first plurality of rays in a plurality of directions from the approximate position of the virtual camera outwards in the digital model and creating a first 3D point cloud including a point for each location whereby a ray from the first plurality of rays intersects with a digital surface of the one or more digital surfaces; digitally projecting a second plurality of rays in a plurality of directions from the camera device within the video stream outwards in the video stream data and creating a second 3D point cloud including a point for each location whereby a ray of the second plurality of rays intersects with an object surface within the one or more object surfaces; and determining a 3D translation, a 3D scale, and a 3D rotation that maximize an alignment of the first point cloud and the second point cloud.
10. The system of claim 9, wherein the computing of the 3D translation, the 3D scale, and the 3D rotation comprises: based on an assumption that a scale of the structure is substantially equal to a scale of the digital model, reducing the computing to a determination of the 3D translation and the 3D rotation only.
11. A non-transitory computer-readable storage medium storing a set of instructions that, when executed by one or more computer processors, causes the one or more computer processors to perform operations, the operations comprising: receiving a video stream captured by a camera device, the video stream including a view of a structure, the structure being associated with a digital model; determining an approximate position of a virtual camera in the digital model, wherein the approximate position corresponds with the view; determining a position and an orientation for one or more digital surfaces within the digital model visible from the approximate position of the virtual camera; determining a position and an orientation of one or more object surfaces visible in the video stream; limiting a rotation of the digital model around a vertical axis, aligning the vertical axis of the digital model with a second vertical axis determined from the video stream, and determining a vertical translation of the digital model so that a lowest horizontal digital surface is aligned with a lowest horizontal object surface visible on the video stream; and applying the vertical translation and the limiting of the rotation to the digital model and displaying the digital model contemporaneously with a display of the video stream.
12. The non-transitory computer-readable storage medium of claim 11, the operations further comprising determining an observation position in the digital model that corresponds to a configurable distance above the lowest horizontal digital surface and associated with the approximate position of the virtual camera.
13. The non-transitory computer-readable storage medium of claim 12, the operations further comprising casting a plurality of rays horizontally from the observation position in the digital model and creating a first 2D point cloud including a point for each location whereby a ray from the plurality of rays intersects with a digital surface.
14. The non-transitory computer-readable storage medium of claim 12, the operations further comprising determining a second observation position in the video stream that corresponds to the configurable distance above a lowest horizontal object surface in the video stream.
15. The non-transitory computer-readable storage medium of claim 14, the operations further comprising casting a plurality of rays horizontally from the second observation position and creating a second 2D point cloud including a point for each location whereby a ray from the plurality of rays intersects an object surface.
16. The non-transitory computer-readable storage medium of claim 15, the operations further comprising determining a 2D translation, a 2D scale, and a 2D rotation that maximize an alignment of the first point cloud and the second point cloud and applying the 2D translation, the 2D scale, and the 2D rotation to the digital model.
17. The non-transitory computer-readable storage medium of claim 11, wherein the approximate position of the virtual camera in the digital model is estimated by performing one or more of the following: receiving input data from an input device, wherein the input data describes the approximate position in the digital model; and receiving a latitude, a longitude, and an altitude from a global positioning system associated with the camera device and computing the approximate position of the virtual camera based on a comparison with a predetermined latitude, longitude, and altitude of a point within the digital model.
18. The non-transitory computer-readable storage medium of claim 11, the operations further comprising: digitally projecting a first plurality of rays in a plurality of directions from the approximate position of the virtual camera outwards in the digital model and creating a first 3D point cloud including a point for each location whereby a ray from the first plurality of rays intersects with a digital surface of the one or more digital surfaces; digitally projecting a second plurality of rays in a plurality of directions from the camera device within the video stream outwards in the video stream data and creating a second 3D point cloud including a point for each location whereby a ray of the second plurality of rays intersects with an object surface within the one or more object surfaces; and determining a 3D translation, a 3D scale, and a 3D rotation that maximize an alignment of the first point cloud and the second point cloud.
19. The non-transitory computer-readable storage medium of claim 18, wherein the computing of the 3D translation, the 3D scale, and the 3D rotation comprises: based on an assumption that a scale of the structure is substantially equal to a scale of the digital model, reducing the computing to a determination of the 3D translation and the 3D rotation only.
20. A method comprising: receiving a video stream captured by a camera device, the video stream including a view of a structure, the structure being associated with a digital model; determining an approximate position of a virtual camera in the digital model, wherein the approximate position corresponds with the view; determining a position and an orientation for one or more digital surfaces within the digital model visible from the approximate position of the virtual camera; determining a position and an orientation of one or more object surfaces visible in the video stream; limiting a rotation of the digital model around a vertical axis, aligning the vertical axis of the digital model with a second vertical axis determined from the video stream, and determining a vertical translation of the digital model so that a lowest horizontal digital surface is aligned with a lowest horizontal object surface visible on the video stream; and applying the vertical translation and the limiting of the rotation to the digital model and displaying the digital model contemporaneously with a display of the video stream.
21. The method of claim 20, further comprising: digitally projecting a first plurality of rays in a plurality of directions from the approximate position of the virtual camera outwards in the digital model and creating a first 3D point cloud including a point for each location whereby a ray from the first plurality of rays intersects with a digital surface of the one or more digital surfaces; digitally projecting a second plurality of rays in a plurality of directions from the camera device within the video stream outwards in the video stream data and creating a second 3D point cloud including a point for each location whereby a ray of the second plurality of rays intersects with an object surface within the one or more object surfaces; and determining a 3D translation, a 3D scale, and a 3D rotation that maximize an alignment of the first point cloud and the second point cloud.
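By way of illustration and not limitation, the following sketch shows one possible way to realize the vertically constrained alignment recited in claims 2 through 7: the vertical axes are assumed to have been aligned already (for example, using gravity sensed by the device), the digital model is shifted vertically so that the lowest horizontal digital surface meets the lowest horizontal object surface, and a yaw-only rotation plus 2D translation is then sought that best aligns the two horizontally ray-cast 2D point clouds. The claims do not prescribe a particular solver; the coarse grid search below, and every function and variable name in it, are assumptions introduced solely for this example.

# Illustrative sketch only (not part of the disclosure): vertically
# constrained alignment of claims 2-7, with scale held fixed.
import numpy as np

def vertical_translation(model_floor_height, stream_floor_height):
    """Claim 2: shift the model so the lowest floors coincide."""
    return stream_floor_height - model_floor_height

def align_2d(model_pts, stream_pts, yaw_steps=360):
    """Claims 4-7: yaw rotation and 2D translation that best align two
    (N, 2) point clouds obtained by horizontal ray casting."""
    best = (np.inf, None, None)
    for yaw in np.linspace(0.0, 2 * np.pi, yaw_steps, endpoint=False):
        c, s = np.cos(yaw), np.sin(yaw)
        R = np.array([[c, -s], [s, c]])
        rotated = model_pts @ R.T
        # translation that matches the centroids, then score by mean
        # nearest-neighbour distance (brute force is fine for a sketch)
        t = stream_pts.mean(axis=0) - rotated.mean(axis=0)
        moved = rotated + t
        d = np.linalg.norm(moved[:, None, :] - stream_pts[None, :, :], axis=2)
        score = d.min(axis=1).mean()
        if score < best[0]:
            best = (score, R, t)
    return best[1], best[2]  # 2D rotation and translation to apply to the model

Centroid matching is only a convenient heuristic when the two clouds overlap well; a production implementation could estimate the 2D translation and scale jointly with the yaw, as claim 7 contemplates.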
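Similarly, for the full 3D alignment recited in claims 9, 18, and 21, with the scale held fixed as contemplated by claims 10 and 19, an iterative-closest-point style refinement with a Kabsch/SVD rigid-transform solver is one conventional choice. Again, this is only an illustrative sketch: the particular iteration, the NumPy implementation, and the names model_cloud and stream_cloud are assumptions and do not appear in the disclosure.

# Illustrative sketch only (not part of the disclosure): rigid alignment of
# the two ray-cast 3D point clouds of claims 9, 18, and 21 (scale fixed).
import numpy as np

def best_rigid_transform(src, dst):
    """Kabsch solution: rotation R and translation t minimizing ||R @ src + t - dst||."""
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:      # guard against a reflection
        Vt[-1, :] *= -1
        R = Vt.T @ U.T
    t = dst_c - R @ src_c
    return R, t

def align_point_clouds(model_points, stream_points, iterations=50):
    """ICP-style loop moving the digital-model cloud onto the video-stream cloud."""
    R_total, t_total = np.eye(3), np.zeros(3)
    moved = model_points.copy()
    for _ in range(iterations):
        # brute-force nearest-neighbour correspondences
        d = np.linalg.norm(moved[:, None, :] - stream_points[None, :, :], axis=2)
        matches = stream_points[d.argmin(axis=1)]
        R, t = best_rigid_transform(moved, matches)
        moved = moved @ R.T + t
        R_total, t_total = R @ R_total, R @ t_total + t
    return R_total, t_total

# Hypothetical usage: model_cloud and stream_cloud are the (N, 3) arrays
# produced by the ray casting of claims 9, 18, and 21.
# R, t = align_point_clouds(model_cloud, stream_cloud)

In practice, the resulting rotation and translation would be applied to the digital model before it is displayed contemporaneously with the video stream, as recited in the claims above.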